The MAILING DATE of this communication appears on the cover sheet with the correspondence address -
Period for Reply 

A SHORTENED STATUTORY PERIOD FOR REPLY IS SET TO EXPIRE 3 MONTH(S) FROM 
THE MAILING DATE OF THIS COMMUNICATION. 
5) 0 Claim(s) is/are allowed.

6) [X] Claim(s) 1-26 is/are rejected.
DETAILED ACTION



1. This office action is responsive to application 09/715,552, filed 11/17/2000.

2. Claims 1-26 are presented for examination. 



Information Disclosure Statement 

3. Applicant is respectfully reminded of the duty under 37 CFR 1.56 to fully disclose
all pertinent information and material pertaining to the patentability of
applicant's claimed invention.

Drawings 

4. Figure 1 should be designated by a legend such as --Prior Art-- because only that
which is old is illustrated. See MPEP § 608.02(g). A proposed drawing correction or 
corrected drawings are required in reply to the Office action to avoid abandonment of 
the application. The objection to the drawings will not be held in abeyance. 



Specification 

5. The disclosure is objected to because of the following informalities: 

Page 1, line 15, "keys" should recite --key--.
Appropriate correction is required. 

6. The disclosure is objected to because of the following informalities: 

Page 5, line 14, "According to aspect" should recite --According to one aspect--.
Appropriate correction is required. 
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public 
use or on sale in this country, more than one year prior to the date of application for patent in the United 
States, 

7. Claims 1-4, 13-17, and 26 are rejected under 35 U.S.C. 102(b) as being 
anticipated by Gal et al., USPN 5,729,732 (hereinafter Gal). 
Regarding Claim 1: 

Gal discloses a method for distributing data items from a particular set of data into a 
plurality of buckets based on distribution keys associated with said data items (Gal, see 
Abstract, "A method is described for operating a computer to sort a set of data records 
each having an associated key for governing the sort process"), the method comprising 
the steps of: 

randomly selecting data items from said particular set of data to produce a 
sampled set of data items (Gal, col. 3, lines 51-52, "The random sampling can be achieved,
for example, by taking a predetermined set of n indices"); 

determining a plurality of ranges based on the distribution keys associated 
with the sampled set of data items (Gal, see Abstract, "determining a range for the key 
values by sampling the key values"); 

assigning said plurality of ranges to said plurality of buckets (Gal, see Abstract, 
"defining a plurality of buckets, each bucket corresponding to a respective one of a 
plurality M of subintervals in the range"); and 
distributing each data item in said particular set of data to the bucket that has
been assigned the range into which falls the distribution key of the data item (Gal, see 
Abstract, "distributing the keys among the buckets by determining directly from each key 
value the index of the subinterval into which the key value falls"). 
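
For illustration only, the four recited steps (sampling, determining ranges, assigning ranges to buckets, and distributing) can be sketched as follows. This is a minimal sketch and is not taken from Gal or from the application: the function names, the sample size, and the choice of sorted-sample quantiles as range boundaries are all assumptions made for the example.

```python
import bisect
import random

def sample_boundaries(keys, num_buckets, sample_size, rng):
    """Derive range boundaries from a random sample of the keys (hypothetical helper)."""
    sample = sorted(rng.sample(keys, min(sample_size, len(keys))))
    # num_buckets - 1 boundaries split the sorted sample into num_buckets ranges.
    return [sample[(i * len(sample)) // num_buckets] for i in range(1, num_buckets)]

def distribute(keys, boundaries):
    """Place each key in the bucket whose assigned range contains it."""
    buckets = [[] for _ in range(len(boundaries) + 1)]
    for key in keys:
        # Bucket i holds keys with boundaries[i-1] <= key < boundaries[i].
        buckets[bisect.bisect_right(boundaries, key)].append(key)
    return buckets

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
keys = [rng.randrange(1000) for _ in range(200)]
boundaries = sample_boundaries(keys, num_buckets=4, sample_size=40, rng=rng)
buckets = distribute(keys, boundaries)
```

Every key lands in exactly one bucket, and the boundaries reflect only the sampled keys, so bucket sizes are approximately rather than exactly balanced.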
Regarding Claim 2: 

Gal discloses randomly selecting data items from each subset of a plurality of subsets of 
said particular set of data (Gal, col. 3, lines 40-41 and 46, "The file to be sorted comprises 
N records each of which has an associated key" and "A random sample of the keys is 
taken from the file y.sub.1, y.sub.2, . . . y.sub.n").
Regarding Claim 3: 

Gal discloses randomly selecting data items from each partition of a partitioned table 
(Gal, col.3, lines 51-54, "The random sampling can be achieved, for example, by taking a 
predetermined set of n indices, for example from a pseudo-random table, and picking the 
corresponding elements"). 
Regarding Claim 4: 

Gal discloses randomly selecting data items from subsets of data, stored in buffers in 
volatile memory (i.e. RAM), that represent results of one or more previously performed 
operations (Gal, see FIG. 1, col. 3, lines 12-13, "the data processing system which may be
utilized for implementing the method and system of the present invention includes a
processor 10, a random access memory (RAM) 12, a read only memory (ROM) 14, at
least one non-volatile storage device 15, a computer display monitor 16 and a keyboard
18").
Regarding Claim 13:

Gal discloses determining ranges that contain an approximately equal amount of 
distribution keys associated with said sampled set of data items (Gal, col. 2, lines 7-19,
"the index of the subinterval into which each key falls is determined directly from the key 
value. This means that the distribution of each key into the respective bucket can be 
performed in a time, which does not depend on the number of buckets used in the 
distribution. The subintervals are equal"). 
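
The quoted passage describes determining the subinterval index directly from the key value when the subintervals are equal. For illustration only, one such computation can be sketched as below, assuming numeric keys and a known range [low, high]; the function name and the clamping behavior are assumptions for the example and are not drawn from Gal.

```python
def subinterval_index(key, low, high, num_buckets):
    """Map a numeric key to one of num_buckets equal-width subintervals of [low, high]."""
    if high == low:
        return 0  # degenerate range: every key falls in the single subinterval
    index = int((key - low) * num_buckets / (high - low))
    # Clamp so that key == high, or a key outside the sampled range, stays valid.
    return max(0, min(num_buckets - 1, index))
```

The cost of this computation is a fixed number of arithmetic operations per key, independent of num_buckets. Note that equal-width subintervals hold approximately equal numbers of keys only when the keys are roughly uniformly spread over the range.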

8. Claims 14-17 and 26 are claims to a computer readable medium carrying
instructions that perform the steps of the method of claims 1-4 and 13. Computer
readable media include CDs, floppy disks, hard drives, memory, etc. Gal teaches a
computer implemented process, thus it is inherent that the program accomplishing the 
procedures must be carried or stored on a computer readable medium to enable the 
computer to function in the manner taught by Gal. Therefore, claims 14-17 and 26 are 
rejected for the reasons set forth above and under the same rationale as claims 1-4 and 13. 
Regarding Claim 14: 
Gal discloses:

distributing data items from a particular set of data into a plurality of buckets 
based on distribution keys associated with said data items (Gal, see Abstract, "A method 
is described for operating a computer to sort a set of data records each having an 
associated key for governing the sort process"), 

randomly selecting data items from said particular set of data to produce a 
sampled set of data items (Gal, col. 3, lines 51-52, "The random sampling can be
achieved, for example, by taking a predetermined set of n indices"); 

determining a plurality of ranges based on the distribution keys associated 
with the sampled set of data items (Gal, see Abstract, "determining a range for the key 
values by sampling the key values"); 

assigning said plurality of ranges to said plurality of buckets (Gal, see Abstract, 
"defining a plurality of buckets, each bucket corresponding to a respective one of a 
plurality M of subintervals in the range"); and 

distributing each data item in said particular set of data to the bucket that has 
been assigned the range into which falls the distribution key of the data item (Gal, see 
Abstract, "distributing the keys among the buckets by determining directly from each key 
value the index of the subinterval into which the key value falls"). 
Regarding Claim 15: 

Gal discloses randomly selecting data items from each subset of a plurality of subsets of 
said particular set of data (Gal, col.3, lines 40-41 and 46, "The file to be sorted comprises 
N records each of which has an associated key" and "A random sample of the keys is 
taken from the file y.sub.1, y.sub.2, . . . y.sub.n").
Regarding Claim 16: 

Gal discloses randomly selecting data items from each partition of a partitioned table 
(Gal, col.3, lines 51-54, "The random sampling can be achieved, for example, by taking a 
predetermined set of n indices, for example from a pseudo-random table, and picking the 
corresponding elements"). 
Regarding Claim 17:

Gal discloses randomly selecting data items from subsets of data, stored in buffers in 
volatile memory, that represent results of one or more previously performed 
operations (Gal, see FIG. 1, col. 3, lines 12-13, "the data processing system which may be
utilized for implementing the method and system of the present invention includes a 
processor 10, a random access memory (RAM) 12, a read only memory (ROM) 14, at 
least one non-volatile storage device 15, a computer display monitor 16 and a keyboard 
18"). 

Regarding Claim 26: 

Gal discloses determining ranges that contain an approximately equal amount of 
distribution keys associated with said sampled set of data items (Gal, col. 2, lines 7-19,
"the index of the subinterval into which each key falls is determined directly from the key 
value. This means that the distribution of each key into the respective bucket can be 
performed in a time, which does not depend on the number of buckets used in the 
distribution. The subintervals are equal"). 



Claim Rejections - 35 USC §103 
The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 
9. Claims 5, 7, 12, 18, 20, and 25 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Gal et al., USPN 5,729,732 (hereinafter Gal) in view of Ogi (USPN 
5,854,938). 
Regarding Claim 5: 

Gal discloses a method for evenly distributing data items to corresponding 
buckets (Gal, see Abstract, "A method is described for operating a computer to sort a set 
of data records each having an associated key for governing the sort process, the method 
comprising determining a range for the key values by sampling the key values; defining a 
plurality of buckets, each bucket corresponding to a respective one of a plurality M of 
subintervals in the range"). 

However, Gal does not particularly disclose processing the buckets with plural 
processors concurrently operating in parallel to execute a task. 

Ogi discloses assigning the plurality of buckets to a plurality of processes (Ogi,
see FIG. 3); and causing each process of said plurality of processes to perform, in parallel 
with the other processes of said plurality of processes, an operation on the data items 
contained in any buckets assigned to the process (Ogi, see FIG.3). 

Gal and Ogi are analogous art because they are from the same field of endeavor in 
parallel operations and data distribution. It would have been obvious for one of ordinary 
skill in the art at the time the invention was made to modify the method for sorting a set 
of data in a computer system using buckets having associated key values within a range 
of Gal such that it utilizes the parallel computer system having plural processors 
concurrently operating in parallel to execute tasks as disclosed by Ogi. 
One of ordinary skill in the art would be motivated to do so because it increases
data management efficiency by optimally storing data in a plurality of database storage 
areas and decreasing the time it takes for a computer system to perform a plurality of 
tasks. This improvement over the prior art would result in efficient load balancing 
thereby optimizing performance in database systems. 
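
For illustration only, assigning buckets to workers that operate in parallel can be sketched as below. This is a minimal sketch and is not taken from Gal or Ogi: threads from the Python standard library stand in for the recited processes, a local sort stands in for the recited operation, and the function names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def process_bucket(bucket):
    """Stand-in operation performed on one bucket; here, a local sort."""
    return sorted(bucket)

def process_buckets_in_parallel(buckets, num_workers):
    """Hand each bucket to a worker and collect the results in bucket order."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(process_bucket, buckets))

# Buckets already hold disjoint, ordered key ranges, so no cross-bucket merge is needed.
buckets = [[3, 1, 2], [12, 10, 11], [25, 21, 23]]
results = process_buckets_in_parallel(buckets, num_workers=3)
merged = [item for bucket in results for item in bucket]
```

Because each bucket covers its own key range, concatenating the per-bucket results in bucket order yields a fully ordered list.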
Regarding Claim 7: 

Ogi discloses the particular set of data is durably stored on a plurality of durable storage 
units (Ogi, see FIG. 3, element 24); and randomly selecting durable storage units from said
plurality of durable storage units and using the data items stored on said randomly 
selected durable storage units as the sampled set of data items (Ogi, see FIG.3, col. 14, 
lines 41-43, "Inside each of the bucket groups, plural buckets (of n sorts here) are stored 
at random regardless of the order of generating the bucket in a level of tuples"). 
Regarding Claim 12: 

Ogi discloses wherein said operation is specified in a database command, 
the method further comprising receiving with said database command data that 
indicates how much of said particular set of data to randomly select to produce said 
sampled set of data items (Ogi, col.3, lines 30-37, "Meanwhile, a work of dividing a tuple 
group into plural groups (buckets) using a grouping function (classification using a value 
of hashing or a specific column, for example) is extremely general in an RDB process. 
For instance, "Group By" phrase in the SQL statement is to clearly demand for a group 
dividing process, or hash join which is a typical system of join operations with the above 
structure features group sorting of tuples with a hash function"). 
10. Claims 18, 20 and 25 are claims to a computer readable medium carrying
instructions that perform the steps of the method of claims 5, 7 and 12. Computer
readable media include CDs, floppy disks, hard drives, memory, etc. Gal teaches a
computer implemented process, thus it is inherent that the program accomplishing the 
procedures must be carried or stored on a computer readable medium to enable the 
computer to function in the manner taught by Gal. Therefore, claims 18, 20 and 25 are 
rejected for the reasons set forth above and under the same rationale as claims 5, 7 and 
12. 

Regarding Claim 18: 

Ogi discloses assigning the plurality of buckets to a plurality of processes (Ogi, see 
FIG.3); and causing each process of said plurality of processes to perform, in parallel 
with the other processes of said plurality of processes, an operation on the data items 
contained in any buckets assigned to the process (Ogi, see FIG.3). 
Regarding Claim 20: 

Ogi discloses the particular set of data is durably stored on a plurality of durable storage 
units (Ogi, see FIG. 3, element 24); and randomly selecting durable storage units from said
plurality of durable storage units and using the data items stored on said randomly 
selected durable storage units as the sampled set of data items (Ogi, see FIG.3, col. 14, 
lines 41-43, "Inside each of the bucket groups, plural buckets (of n sorts here) are stored 
at random regardless of the order of generating the bucket in a level of tuples").



Regarding Claim 25: 

Ogi discloses wherein said operation is specified in a database command, 
the method further comprising receiving with said database command data that
indicates how much of said particular set of data to randomly select to produce said 
sampled set of data items (Ogi, col.3, lines 30-37, "Meanwhile, a work of dividing a tuple 
group into plural groups (buckets) using a grouping function (classification using a value 
of hashing or a specific column, for example) is extremely general in an RDB process. 
For instance, "Group By" phrase in the SQL statement is to clearly demand for a group 
dividing process, or hash join which is a typical system of join operations with the above 
structure features group sorting of tuples with a hash function"). 



11. Claims 6 and 19 are rejected under 35 U.S.C. 103(a) as being unpatentable over the
prior art as applied to claim 1 above and further in view of Marks USPN 5,748,844. 
Regarding Claim 6: 

Gal discloses a method for randomly selecting data items from each subset of a 
plurality of subsets of said particular set of data (Gal, col.3, lines 40-41 and 46, "The file 
to be sorted comprises N records each of which has an associated key" and "A random 
sample of the keys is taken from the file y.sub.1, y.sub.2, . . . y.sub.n").

However, Gal does not particularly disclose selecting a distinct random seed 
for each subset of the plurality of subsets of said particular set of data. 

Marks discloses selecting a distinct random seed for each subset of the plurality of 
subsets of said particular set of data (Marks, FIG.3, element 32, col.3, lines 44-46 "the 
seed-growth heuristic initially assigns a small number of randomly chosen nodes to each 
part of the partition; these are the seed nodes"). 
Gal and Marks are analogous art because they are from the same field of endeavor
in partitioning that can be applied to database design and parallel processing. Generally, 
random number generators have to be "seeded" with an initial seed. The use of random 
seed generators from which a "first" random number is derived is well known in the art. 

It would have been obvious for one of ordinary skill in the art at the time the 
invention was made to modify the method for randomly selecting data items from each 
subset of a plurality of subsets such that it utilizes the random seed generator of Marks. 

One of ordinary skill in the art would be motivated to do so because it assures the 
best random sampling of data for partitioning thereby yielding a superior sampling 
operation performed in parallel. 
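
For illustration only, seeding a separate random number generator for each subset can be sketched as below. This is a minimal sketch and is not taken from Gal or Marks: the function name and the scheme of deriving each seed as base_seed plus the subset's position are assumptions for the example.

```python
import random

def sample_subsets(subsets, per_subset, base_seed):
    """Sample each subset with its own generator, seeded distinctly per subset."""
    samples = []
    for position, subset in enumerate(subsets):
        rng = random.Random(base_seed + position)  # distinct seed for this subset
        samples.append(rng.sample(subset, min(per_subset, len(subset))))
    return samples

subsets = [list(range(0, 100)), list(range(100, 200)), list(range(200, 300))]
first_run = sample_subsets(subsets, per_subset=5, base_seed=42)
second_run = sample_subsets(subsets, per_subset=5, base_seed=42)
```

Since each subset has its own generator, the subsets can be sampled independently of one another, and repeating the run with the same base seed reproduces the same samples.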
Regarding Claim 19: 

Claim 19 is a claim to a computer readable medium carrying instructions that perform
the steps of the method of claim 6. Computer readable media include CDs, floppy
disks, hard drives, memory, etc. Gal teaches a computer implemented process, thus it is
inherent that the program accomplishing the procedures must be carried or stored on a
computer readable medium to enable the computer to function in the manner taught by
Gal. Therefore, claim 19 is rejected for the reasons set forth above and under the same 
rationale as claim 6. 



12. Claims 8, 10, 21, and 23 are rejected under 35 U.S.C. 103(a) as being unpatentable
over the prior art as applied to claim 1 above, and in further view of Couch et al. USPN 
6,604,096 (hereinafter Couch). 
Regarding Claim 8:

Gal discloses randomly selecting data items from said particular set of data to 
produce a sampled set of data items (Gal, col. 3, lines 51-52, "The random sampling can
be achieved, for example, by taking a predetermined set of n indices"). 

Gal does not particularly disclose selecting a specified percentage of data items in 
said particular set of data. 

Couch discloses selecting a specified percentage of data items in said particular 
set of data (Couch, see FIG.6, element 120, col. 9, lines 49-51, "percentage designation 
control 120 may be use to select the amount of data"). 

Gal and Couch are analogous art because they are from the same field of endeavor 
in database management systems. It would have been obvious for one of ordinary skill in 
the art at the time the invention was made to implement the method for randomly 
selecting data items from said particular set of data to produce a sampled set of data items 
such that it incorporates selecting a specified percentage of data items in said particular 
set of data as disclosed by Couch.

One of ordinary skill in the art would be motivated to do so because it assures the 
even distribution of data resulting in improved quality of partitioning. 
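
For illustration only, randomly selecting a specified percentage of the data items can be sketched as below. This is a minimal sketch and is not taken from Gal or Couch: the function name, the rounding rule, and the range check are assumptions for the example.

```python
import random

def sample_percentage(items, percent, rng):
    """Randomly select the given percentage (0 to 100) of items, without replacement."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    count = round(len(items) * percent / 100)
    return rng.sample(items, count)

rng = random.Random(7)  # fixed seed only so the sketch is repeatable
items = list(range(200))
sampled = sample_percentage(items, percent=10, rng=rng)  # selects 20 of the 200 items
```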
Regarding Claim 10: 

Couch discloses the method of Claim 8 further comprising the step of receiving, from a 
user, data that specifies said percentage (Couch, col. 14, lines 1-2, "configured to receive 
a user selection of a percentage of the data"). 
13. Claims 21 and 23 are claims to a computer readable medium carrying
instructions that perform the steps of the method of claims 8 and 10. Computer
readable media include CDs, floppy disks, hard drives, memory, etc. Gal teaches a
computer implemented process, thus it is inherent that the program accomplishing the 
procedures must be carried or stored on a computer readable medium to enable the 
computer to function in the manner taught by Gal. Therefore, claims 21 and 23 are 
rejected for the reasons set forth above and under the same rationale as claims 8 and 10. 
Regarding Claim 21: 

Gal discloses randomly selecting data items from said particular set of data to 
produce a sampled set of data items (Gal, col. 3, lines 51-52, "The random sampling can
be achieved, for example, by taking a predetermined set of n indices"). 

Gal does not particularly disclose selecting a specified percentage of data items in 
said particular set of data. 

Couch discloses selecting a specified percentage of data items in said particular 
set of data (Couch, see FIG.6, element 120, col.9, lines 49-51, "percentage designation 
control 120 may be use to select the amount of data"). 

Gal and Couch are analogous art because they are from the same field of endeavor 
in database management systems. It would have been obvious for one of ordinary skill in 
the art at the time the invention was made to implement the method for randomly 
selecting data items from said particular set of data to produce a sampled set of data items 
such that it incorporates selecting a specified percentage of data items in said particular 
set of data as disclosed by Couch.
One of ordinary skill in the art would be motivated to do so because it assures the
even distribution of data resulting in improved quality of partitioning. 
Regarding Claim 23: 

Couch discloses the method of Claim 8 further comprising the step of receiving, from a 
user, data that specifies said percentage (Couch, col. 14, lines 1-2, "configured to receive
a user selection of a percentage of the data"). 

14. Claims 9, 11, 22, and 24 are rejected under 35 U.S.C. 103(a) as being
unpatentable over the prior art as applied to claim 7 above, and further in view of Couch 
et al. USPN 6,604,096 (hereinafter Couch). 
Regarding Claim 9: 

Keeping in mind what Gal and Ogi disclose as mentioned above, the reference of
Gal modified by Ogi does not teach selecting a specified percentage of the plurality of 
durable storage units that are storing said particular set of data. 

Couch discloses selecting a specified percentage of the plurality of
durable storage units that are storing said particular set of data (Couch, col. 5, lines 13-15,
"The operational data may be collected as a single data set, or may be distributed over
different locations including over different storage devices").

It would have been obvious for one of ordinary skill in the art at the time the 
invention was made to implement the method for randomly selecting data items from said 
particular set of data to produce a sampled set of data items such that it incorporates 
selecting a specified percentage of the plurality of durable storage units that
are storing said particular set of data as disclosed by Couch.
One of ordinary skill in the art would be motivated to do so because it allows
flexible storage of data thereby improving performance of the database management 
system. 

Regarding Claim 11: 

Couch discloses the method of Claim 9 further comprising the step of receiving, from a
user, data that specifies said percentage (Couch, col. 14, lines 1-2, "configured to receive 
a user selection of a percentage of the data"). 

15. Claims 22 and 24 are claims to a computer readable medium carrying
instructions that perform the steps of the method of claims 9 and 11. Computer
readable media include CDs, floppy disks, hard drives, memory, etc. Gal teaches a
computer implemented process, thus it is inherent that the program accomplishing the
procedures must be carried or stored on a computer readable medium to enable the
computer to function in the manner taught by Gal. Therefore, claims 22 and 24 are
rejected for the reasons set forth above and under the same rationale as claims 9 and 11.
Regarding Claim 22: 

Keeping in mind what Gal and Ogi disclose as mentioned above, the reference of
Gal modified by Ogi does not teach selecting a specified percentage of the plurality of 
durable storage units that are storing said particular set of data. 

Couch discloses selecting a specified percentage of the plurality of
durable storage units that are storing said particular set of data (Couch, col. 5, lines 13-15, 
"The operational data may be collected as a single data set, or may be distributed over 
different locations including over different storage devices"). 
It would have been obvious for one of ordinary skill in the art at the time the
invention was made to implement the method for randomly selecting data items from said
particular set of data to produce a sampled set of data items such that it incorporates
selecting a specified percentage of the plurality of durable storage units that
are storing said particular set of data as disclosed by Couch.

One of ordinary skill in the art would be motivated to do so because it allows 
flexible storage of data thereby improving performance of the database management 
system. 

Regarding Claim 24: 

Couch discloses the method of Claim 9 further comprising the step of receiving, from a 
user, data that specifies said percentage (Couch, col. 14, lines 1-2, "configured to receive 
a user selection of a percentage of the data"). 



Conclusion 

16. The prior art made of record and not relied upon is considered pertinent to applicant's 
disclosure. See Form PTO-892. 

17. Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Anh T Nguyen whose telephone number is (703) 305-8649. 
The examiner can normally be reached on Monday-Friday from 7:00 AM to 4:30 PM. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, William Grant, can be reached on (703) 308-1108. The fax phone number for
the organization where this application or proceeding is assigned is (703) 872-9306.
Any inquiry of a general nature or relating to the status of this application or proceeding
should be directed to the receptionist whose telephone number is (703) 306-5484. 



Anh T. Nguyen
Art Unit 2127 
November 24, 2003 




WILLIAM GRANT 
SUPERVISORY PATENT EXAMINER 

TECHNOLOGY CENTER 2100




